Media Handling

Media behavior covers everything related to the establishment, management, and termination of media sessions in SIP. Media sessions are created using the SIP offer-answer mechanism. A successful negotiation results in a bidirectional media (RTP) flow (e.g., audio, fax, modem, DTMF). Each offer-answer exchange may create multiple media sessions of different types (e.g., audio and fax). Multiple offer-answer transactions may occur within a single SIP dialog, and each may change the media session characteristics (e.g., IP address, port, coders, media types, and RTP mode).

The media capabilities exchanged in an offer-answer transaction include the following:

Media types (e.g., audio, secure audio, video, fax, and text).
IP addresses and ports of the media flow.
Media flow mode (send-receive, receive-only, send-only, or inactive).
Media coders (coders and their characteristics used in each media flow).
Other (standard or proprietary) media and session characteristics.
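
As an illustration of the capabilities listed above, the following minimal sketch extracts them from an SDP body. It is not the device's implementation: the names MediaDescription, parse_sdp, and EXAMPLE_OFFER are hypothetical, the SDP line types (c=, m=, a=rtpmap, and the direction attributes) come from the SDP specification rather than from this section, and the addresses are documentation-range placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Direction attributes defined by SDP; "sendrecv" is the default when none is given.
DIRECTIONS = {"sendrecv", "recvonly", "sendonly", "inactive"}


@dataclass
class MediaDescription:
    """One media session described by an SDP m= section (hypothetical type)."""
    media_type: str                 # e.g. "audio", "video", "image" (T.38 fax)
    port: int                       # port of the media flow
    transport: str                  # e.g. "RTP/AVP", "RTP/SAVP", "udptl"
    address: Optional[str]          # IP address of the media flow
    mode: str = "sendrecv"          # media flow mode
    coders: Dict[str, str] = field(default_factory=dict)  # format -> rtpmap text


def parse_sdp(sdp: str) -> List[MediaDescription]:
    """Collect media type, address, port, mode, and coders per m= section."""
    session_address: Optional[str] = None
    session_mode: Optional[str] = None
    media: List[MediaDescription] = []
    current: Optional[MediaDescription] = None

    for raw in sdp.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue
        kind, value = line[0], line[2:]
        if kind == "c":
            # c=<nettype> <addrtype> <address>; a media-level line overrides the session level.
            address = value.split()[-1]
            if current is None:
                session_address = address
            else:
                current.address = address
        elif kind == "m":
            # m=<media> <port>[/<count>] <proto> <fmt> ...
            parts = value.split()
            current = MediaDescription(
                media_type=parts[0],
                port=int(parts[1].split("/")[0]),
                transport=parts[2],
                address=session_address,
                mode=session_mode or "sendrecv",
                coders={fmt: "" for fmt in parts[3:]},
            )
            media.append(current)
        elif kind == "a":
            if value in DIRECTIONS:
                if current is None:
                    session_mode = value
                else:
                    current.mode = value
            elif value.startswith("rtpmap:") and current is not None:
                fmt, _, encoding = value[len("rtpmap:"):].partition(" ")
                current.coders[fmt] = encoding
    return media


# Illustrative offer with one audio session and one fax (T.38) session.
EXAMPLE_OFFER = """v=0
o=- 1 1 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 6000 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=sendonly
m=image 6002 udptl t38
"""

if __name__ == "__main__":
    for description in parse_sdp(EXAMPLE_OFFER):
        print(description)
```

Running the sketch on EXAMPLE_OFFER yields two media descriptions, which mirrors the point that a single offer-answer may create several media sessions of different types.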

Typically, the device doesn't change the negotiated media capabilities; negotiation is performed mainly by the remote user agents. However, the device does examine the SDP offer-answer exchange and may take an active role in it. It does so mainly to anchor the media to the device (the default) and, if configured, to change the negotiated media type. Some of the media handling features, which are described later in this section, include the following:

Media anchoring (default).
Direct media (see Direct Media Calls).
Audio coders restrictions.
Audio coders transcoding.
RTP-SRTP transcoding.
DTMF translations.
Fax translations and detection.
Early media and ringback tone handling.
Call hold translations and held tone generation.
NAT traversal.
RTP broken connections.
Media firewall:
RTP pin holes: only RTP packets related to a successful offer-answer negotiation traverse the device. When the device initializes, no RTP pin holes are open, so all RTP/RTCP packets destined to the device are discarded. Once an offer-answer transaction completes successfully, an RTP pin hole is opened and RTP/RTCP flows between the two remote user agents. While a pin hole is open, the payload type and RTP header version of each packet are validated. RTP pin holes close when one of the associated SIP dialogs closes (which may also result from a broken connection).
Late rogue detection: once a dialog is disconnected, the related pin holes are also closed.
Deep packet inspection of the RTP that flows through the open pin holes.
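
The pin-hole behavior described above can be sketched as a small lookup table keyed by the local media address and port. This is only an illustrative model under stated assumptions, not the device's implementation: the names MediaFirewall, PinHole, and make_rtp_packet are hypothetical, the header layout (version in the top two bits of the first byte, payload type in the low seven bits of the second byte, 12-byte fixed header) is taken from the RTP specification, and RTCP handling is omitted for brevity.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Set, Tuple

Endpoint = Tuple[str, int]  # (local IP address, local UDP port)

RTP_VERSION = 2
RTP_FIXED_HEADER_LEN = 12


@dataclass
class PinHole:
    """State kept for one opened pin hole (hypothetical type)."""
    dialog_id: str
    allowed_payload_types: Set[int]


class MediaFirewall:
    """Discards RTP unless a successful offer-answer opened a pin hole for it."""

    def __init__(self) -> None:
        # At initialization no pin holes are open, so every packet is discarded.
        self._pin_holes: Dict[Endpoint, PinHole] = {}

    def open_pin_hole(self, endpoint: Endpoint, dialog_id: str,
                      payload_types: Set[int]) -> None:
        """Call once an offer-answer transaction has completed successfully."""
        self._pin_holes[endpoint] = PinHole(dialog_id, set(payload_types))

    def close_dialog(self, dialog_id: str) -> None:
        """Close every pin hole that belongs to a disconnected dialog."""
        self._pin_holes = {
            endpoint: hole for endpoint, hole in self._pin_holes.items()
            if hole.dialog_id != dialog_id
        }

    def accept(self, endpoint: Endpoint, packet: bytes) -> bool:
        """Return True only if the packet may traverse the firewall."""
        hole = self._pin_holes.get(endpoint)
        if hole is None or len(packet) < RTP_FIXED_HEADER_LEN:
            return False
        version = packet[0] >> 6        # top two bits of the first byte
        payload_type = packet[1] & 0x7F  # low seven bits of the second byte
        return (version == RTP_VERSION
                and payload_type in hole.allowed_payload_types)


def make_rtp_packet(payload_type: int, version: int = RTP_VERSION) -> bytes:
    """Build a minimal 12-byte RTP header, for illustration only."""
    return struct.pack("!BBHII", version << 6, payload_type & 0x7F, 1, 160, 0x1234)


if __name__ == "__main__":
    firewall = MediaFirewall()
    endpoint = ("192.0.2.1", 6000)
    print(firewall.accept(endpoint, make_rtp_packet(0)))   # no pin hole yet
    firewall.open_pin_hole(endpoint, "dialog-1", {0, 8})
    print(firewall.accept(endpoint, make_rtp_packet(0)))   # negotiated payload type
    print(firewall.accept(endpoint, make_rtp_packet(96)))  # payload type not negotiated
    firewall.close_dialog("dialog-1")
    print(firewall.accept(endpoint, make_rtp_packet(0)))   # dialog closed
```

Keying the table by dialog identifier as well as endpoint lets a single call to close_dialog remove every pin hole tied to that dialog, which is the behavior that late rogue detection relies on in this model.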